Using Scale-Space Anisotropic Smoothing for Text Line Extraction in Historical Documents

Authors

  • Rafi Cohen
  • Its'hak Dinstein
  • Jihad El-Sana
  • Klara Kedem
Abstract

This paper presents a novel approach to text line extraction based on Gaussian scale space, a dedicated binarization, and an energy minimization framework. The approach first enhances the text lines in the image using a multi-scale bank of anisotropic second-derivative-of-Gaussian filters tuned to the average text line height. It then applies a binarization that is based on a component tree and tailored for line extraction. The final stage uses an energy minimization framework to remove spurious text lines and to assign connected components to lines. We developed two algorithms along these ideas: one for documents with mildly curled, generally horizontal lines, and the other for multi-skew and curled text lines. We tested our approach on various datasets written in different languages and covering a wide range of image quality, and achieved high detection rates that outperform state-of-the-art algorithms.
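
The enhancement stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes dark text on a light background and generally horizontal lines, and the function names, the elongation ratio `eta`, and the scale list `sigmas_y` are hypothetical choices made for illustration. Each filter smooths along the line direction with a wide Gaussian, takes a second derivative across it with a narrower one, and the strongest scale-normalized response over all scales is kept.

```python
import numpy as np


def gaussian_kernels(sigma):
    """Sampled 1-D Gaussian and its second derivative for a given sigma."""
    radius = int(np.ceil(4 * sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2))
    g /= g.sum()
    # Analytic second derivative: g''(x) = (x^2 / sigma^4 - 1 / sigma^2) g(x)
    g2 = (x**2 / sigma**4 - 1.0 / sigma**2) * g
    g2 -= g2.mean()  # force zero sum so flat regions give zero response
    return g, g2


def convolve_axis(img, kernel, axis):
    """Convolve every 1-D slice along `axis`, replicating border pixels."""
    r = len(kernel) // 2
    pad = [(0, 0), (0, 0)]
    pad[axis] = (r, r)
    padded = np.pad(img, pad, mode="edge")
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), axis, padded
    )


def enhance_text_lines(img, sigmas_y, eta=3.0):
    """Maximum over scales of an anisotropic second-derivative response.

    sigmas_y : vertical scales, assumed to be chosen around the
               average text line height (hypothetical parameter).
    eta      : assumed ratio sigma_x / sigma_y (elongation along the line).
    """
    img = np.asarray(img, dtype=float)
    best = np.full(img.shape, -np.inf)
    for sigma_y in sigmas_y:
        g_x, _ = gaussian_kernels(eta * sigma_y)
        _, g2_y = gaussian_kernels(sigma_y)
        resp = convolve_axis(img, g2_y, axis=0)  # derivative across the line
        resp = convolve_axis(resp, g_x, axis=1)  # smoothing along the line
        # sigma^2 normalization makes responses comparable across scales
        best = np.maximum(best, sigma_y**2 * resp)
    return best
```

With this sign convention, rows running through the middle of a dark text line produce positive responses, while the gaps between lines stay near zero or negative. Thresholding such a map would then be the job of the binarization stage, which this sketch does not attempt to reproduce.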


Similar resources

Adaptive smoothing: a general tool for early vision

We present a method to smooth a signal, whether it is an intensity image, a range image, or a planar curve, while preserving discontinuities. This is achieved by repeatedly convolving the signal with a very small averaging mask weighted by a measure of the signal continuity at each point. The method is extremely attractive since edge detection can be performed after a few iterations, and features ...


Improving edge detection and watershed segmentation with anisotropic diffusion and morphological levellings

Edge-preserving smoothing and image simplification are of fundamental importance in a variety of remote sensing applications during feature extraction and object detection procedures. The construction of a pre-processing filtering tool for edge detection and segmentation tasks is still an open matter. Towards this end, this paper brings together two advanced nonlinear scale space representations...


A New Approach for Text Documents Classification with Invasive Weed Optimization and Naive Bayes Classifier

With the rapid increase in the number of documents, the use of Text Document Classification (TDC) methods has become crucial. This paper presents a hybrid model of Invasive Weed Optimization (IWO) and a Naive Bayes (NB) classifier (IWO-NB) for Feature Selection (FS), in order to reduce the large size of the feature space in TDC. TDC includes different actions such as text processing, feature extraction, form...


Why choosing advanced nonlinear scale space filtering for denoising and simplifying images?

Denoising, edge-preserving smoothing, and image simplification are of fundamental importance in a variety of image processing and computer vision applications during feature extraction and object detection procedures. The construction of an optimal pre-processing filtering tool for various corner/edge detection and segmentation tasks is still an open matter. Towards this end, here we argue that f...


PAIRED ANISOTROPIC DISTRIBUTION FOR IMAGE SELECTIVE SMOOTHING

In this paper, we present a novel approach for image selective smoothing by the evolution of two paired nonlinear partial differential equations. The distribution coefficient in de-noising equation controls the speed of distribution, and is determined by the edge-strength function. In the previous works, the edge-strength function depends on isotropic smoothing of the image...



Publication date: 2014